Devnagari document segmentation using histogram approach
Abstract
Document segmentation is a critical phase in machine recognition of any language. The accuracy of a character recognition technique depends on correctly segmenting individual symbols. Segmentation decomposes an image of a sequence of characters into sub-images of individual symbols by segmenting lines and words. Devnagari is the most widely used script in India. It is used for writing Hindi, Marathi, Sanskrit and Nepali. Moreover, Hindi is the third most widely spoken language in the world. Devnagari documents consist of vowels, consonants and various modifiers, so proper segmentation of a Devnagari word is challenging. This paper proposes a simple histogram-based approach to segmenting Devnagari documents and discusses various challenges in segmenting Devnagari script.
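The abstract does not spell out the procedure, so the following is only a minimal sketch of a generic projection-histogram segmentation, not the authors' implementation. It assumes a binarized page stored as a list of rows where 1 means ink and 0 means background; the names `find_runs`, `segment_lines`, `segment_words` and the sample `page` are hypothetical. Lines are found from the horizontal histogram (ink per row) and words from the vertical histogram (ink per column) within each line.

```python
from typing import List, Tuple

# Assumed convention: binary image as rows of ints, 1 = ink, 0 = background.
Image = List[List[int]]


def find_runs(profile: List[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where profile > threshold."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of ink begins
        elif value <= threshold and start is not None:
            runs.append((start, i))        # a gap closes the current run
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the end of the profile
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal histogram."""
    row_hist = [sum(row) for row in image]
    return find_runs(row_hist)


def segment_words(image: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Split the line in rows [top, bottom) into words using the vertical histogram."""
    band = image[top:bottom]
    col_hist = [sum(col) for col in zip(*band)]
    return find_runs(col_hist)


# Tiny hypothetical page: two text lines, each containing two "words".
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1],
]

for top, bottom in segment_lines(page):
    print((top, bottom), segment_words(page, top, bottom))
# (1, 3) [(0, 3), (4, 6)]
# (4, 5) [(0, 2), (5, 7)]
```

Because the header line of Devnagari script joins the characters of a word, empty-column gaps in this sketch separate words rather than individual characters; splitting a word into symbols would need further steps that are not shown here.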
Similar resources
Multiple Classifier Combination for Off-line Handwritten Devnagari Character Recognition
This work presents the application of a weighted majority voting technique for combining the classification decisions obtained from three Multi-Layer Perceptron (MLP) based classifiers for recognition of handwritten Devnagari characters using three different feature sets. The features used are intersection, shadow feature and chain code histogram features. Shadow features are computed globally for...
Devnagari Numerals Classification and Recognition Using an Integrated Approach
Character recognition has always been a challenging field for researchers. There has been astounding progress in the development of systems for character recognition. OCR recognizes the text in a scanned document image and converts it into editable form. The OCR process can have several stages such as preprocessing, segmentation, recognition and post-processing. The r...
Application of Statistical Features in Handwritten Devnagari Character Recognition
In this paper a scheme for offline handwritten Devnagari character recognition is proposed, which uses different feature extraction methodologies and recognition algorithms. The proposed system assumes no constraints on writing style or size. First the character is preprocessed, and features, namely chain code histogram and moment invariant features, are extracted and fed to Multilayer Perceptro...
Indus Image Segmentation Using Watershed and Histogram Projections
Character segmentation is a major step in document image analysis and optical character recognition (OCR). It is needed to detect all the character regions in the document image. The proposed method preprocesses the document image with edge detection techniques to enhance the character edges. Further, the watershed algorithm is implemented to identify the regions of...
Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier
Recognition of handwritten characters is a challenging task because of the variability in the writing styles of different individuals. In this paper we propose a quadratic classifier based scheme for the recognition of offline Devnagari handwritten characters. The features used in the classifier are obtained from the directional chain code information of the contour points of the chara...
Journal title:
- CoRR
Volume abs/1109.1247
Publication date 2011